1. INTRODUCTION

Physiotherapy, a component of modern healthcare, is concerned with the development, maintenance and restoration of body movement and function after illness or injury [1]. Many diseases and illnesses are treated with exercises that help manage pain and prevent disease. Medical experts or therapists are tasked with assessing and treating people who have movement impairments and are unable to perform daily tasks due to an injury or illness.

Strokes, brain injuries, motor disabilities, sports injuries, post-accident injuries and Parkinson's disease are examples of conditions treated with physiotherapy. Stroke, also known as cerebrovascular accident or brain attack, is one of the top five leading causes of death and one of the top 10 causes of hospitalization in Malaysia [2]. According to the World Health Organization (WHO), stroke ranks as the second leading cause of death. Physiotherapy can help stroke patients regain muscle control and strength, depending on the severity of the stroke. Fifty percent of stroke survivors suffer from motor function disabilities that require continuous rehabilitation [3].


1.1 Types of Exercises 
Patients are usually given exercises or training at their own pace and tolerance level. Training that is adequate for one patient may not be equally adequate for others. There are several types of physical therapy exercises, which depend on the patient's particular condition and physical capabilities. Some of them are:

1. Range of Motion 

2. Muscles Strengthening Exercises 

3. Balance Exercises 

4. Flexibility Exercises 

5. Post-Surgery Exercises 


Table 1 summarizes the types of exercises together with examples and conditions.


Table 1. Types of Exercises Corresponding to the Conditions

| Physical Therapy Exercises | Explanation | Examples of Exercises |
|---|---|---|
| Range of Motion | Range of Motion (ROM) exercises help you move your joints to prevent stiffness. | Active ROM; Passive ROM; Active-assisted ROM |
| Muscles Strengthening Exercises | Increasing muscle strength to gain better balance, mobility and the ability to enjoy a normal lifestyle. | Squat; One-Arm Row; Modified Push Up; Shoulder Press; Knee Extension; Bridging |
| Balance Exercises | Balance exercises can help people with balance problems or muscle weakness, preventing them from sudden falls. | Sit to Stand; Standing on one foot; Walking in a straight line; Yoga, Tai Chi |
| Endurance Exercises | Increase breathing and heart rate, improving the health of the lungs and heart as well as a person's overall fitness. | Walking; Stairs Climbing |
| Flexibility Exercises | Stretching can help the human body become more flexible and limber. | Hamstring Stretch; Chest Stretch; Calf Stretch; Back Stretch; Shoulder Stretch |
| Post-Surgery Exercises | Surgery patients experience pain, muscle contractions and stiffness. Physical therapy can relieve these issues by gradually adjusting the physical conditioning. | Head Lift; Buttock Lift; Walking |

Conditions: Arthritis; sport injuries; Post-Surgical Healing; Caution; weight control; pulmonary diseases; Stroke; Heart diseases; Stroke; Cardiac event; Elderly; Low Blood Pressure; Parkinson's; Cardiovascular diseases; Back pain; Disc Diseases; Parkinson's; Depends on body part surgeries.

Currently, patients need to visit therapists at the clinic, which is a burden for both the caregivers and the patients themselves. It is especially inconvenient for elderly and bedridden patients to travel back and forth once a week for physiotherapy. In addition, patients may need to wear assistive devices such as sensors throughout the session, which makes training unpleasant. Moreover, clinics and hospitals have limited physical therapy equipment and few therapists. Patients may therefore have to wait their turn to perform the exercises, because the therapists are unable to assess them or the equipment is being used by other patients.


1.2 Limitations of Exergames and Serious Games 

Recent studies have demonstrated that the Kinect sensor can be used to evaluate clinically relevant parameters of gait [4], [5] and posture [6]. Kinect-based virtual stepping treatment has been shown to be effective for post-stroke rehabilitation of gait [7]. Numerous Kinect-based applications and studies continue to appear. The leading Kinect-based applications for rehabilitation exercises are exercise games and serious games.

Exercise games, known as exergames, aim to integrate natural human motion with entertainment to promote exercise among the elderly, while serious games aim to rehabilitate motor-impaired users and monitor patients' progress at the same time. However, exergames and serious games reach their limits where the requirements for clinical data capture of specific limb movements cannot be met. The limitations of exergames and serious games are summarized below [8]:

1. Games designed specifically for diagnostic use are restricted to non-occluding movements. This implies that standard stroke impairment level tests requiring pervasive occluding movement sets may be impractical for a Kinect-based system to capture.
2. Diagnostic potential for extremities is limited to gross movements, as fine movements of the hand and foot are currently outside the Kinect's capture sensitivity.
3. Games targeted at rehabilitation may be prone to "cheating", meaning unnatural movements.
4. Appropriate response to failure and poor performance, if not accounted for during game design, can inherently limit positive outcomes due to demotivation.
5. The advantages of the games have mainly been considered for the short term and in small studies.

In addition, the games themselves have drawbacks, as they are not suitable for all ages. For example, exergames and serious games are built only for young and middle-aged people. They are inadequate for the elderly, who may have secondary disabilities such as eyesight, hearing or speech problems and thus cannot focus on the screen or on the games. The games are also unsuitable for bedridden patients.




2. HUMAN ACTION RECOGNITION 

Vision-based human action recognition is a systematic way to recognize and interpret the movement of people in camera-captured content. It draws on fields such as biomechanics, machine vision, image processing, artificial intelligence and pattern recognition [9]. Human activities can be classified into four categories: actions, gestures, interactions and group activities [10]. Motion recognition covers many actions such as walking, sitting, standing, running and waving. Figure 1 shows the steps in a human motion recognition system.


Figure 1. General framework of human action recognition


2.1 Detection 

Human detection is an early stage of human action recognition. It can be divided into categories according to the body parts involved, such as upper limbs, lower limbs, hands, arms, head and legs. Each body part needs a different detection method to reach the accuracy required for evaluation.

Computer vision and machine learning algorithms have been adopted to address the problem of human detection in videos. Many solutions and techniques have been introduced for various scenarios, including variations in illumination and pose as well as background clutter. Dalal and Triggs reported impressive results on human detection [11] by using Histograms of Oriented Gradients (HOG) as low-level features, which outperformed other features such as wavelets [12], PCA-SIFT [13] and shape contexts [14]. Zhu et al. proposed a rejection cascade using HOG features to improve detection speed [15], while Zhang et al. devised a multi-resolution framework to cut down computational cost [16].
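
The per-cell orientation histogram at the core of HOG features can be sketched as follows. This is a simplified illustration under stated assumptions, not the descriptor of [11]: it uses central differences, hard bin assignment and no block normalization, and the function name `cell_orientation_histogram` and the 9-bin default are choices made only for this example.

```python
import math

def cell_orientation_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram for one grayscale cell.

    `cell` is a list of rows of pixel intensities. Gradients are central
    differences on interior pixels; each pixel votes into one orientation
    bin over 0-180 degrees, weighted by its gradient magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# Hypothetical 4x4 patch with a vertical edge: intensity changes only along x.
patch = [[0, 0, 10, 10] for _ in range(4)]
hist = cell_orientation_histogram(patch)
```

For this patch every interior gradient points along the x axis, so all of the gradient energy falls into the first (0 degree) bin.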

In contrast, Lowe proposed the Scale-Invariant Feature Transform (SIFT), which offers high accuracy and low computation time [17] and was also employed by Khaledian et al. for hand gesture recognition [18]. Ke and Sukthankar attempted to improve the method further by introducing PCA-SIFT [13]. Next, Bay et al. introduced Speeded Up Robust Features (SURF), which was shown to yield results comparable to or better than SIFT [19].

Perhaps the most recent and promising approach is detection with an RGB-D camera such as the Kinect sensor. This detection can be split into two common methods: skeleton joints and depth maps.

I. Skeleton Joints

Skeleton joints combine a number of joints that define body parts such as the head, shoulders, neck and arms. The resulting representation has a very large number of dimensions and describes characteristics unique to each individual, such as shape, size, posture and motion. Each version of Kinect has a different number of joint types making up a skeleton. Version 1 (Figure 2a) has 20 joint types, while version 2 (Figure 2b) has 25 joint types, five more than Kinect v1. The additional joints in Kinect v2 are SpineShoulder, HandTipLeft, ThumbLeft, HandTipRight and ThumbRight. The joint numbers are listed in Table 2.

Figure 2. (a) 20-Joints of human body in action recognition through Kinect Version 1;
(b) 25-Joints of human body in action recognition through Kinect Version 2


Table 2. Kinect Skeleton Joint Types for V1 and V2

| Joint number | Kinect V1 (20 joints) | Kinect V2 (25 joints) |
|---|---|---|
| 0 | HipCenter | SpineBase |
| 1 | Spine | SpineMid |
| 2 | ShoulderCenter | Neck |
| 3 | Head | Head |
| 4 | ShoulderLeft | ShoulderLeft |
| 5 | ElbowLeft | ElbowLeft |
| 6 | WristLeft | WristLeft |
| 7 | HandLeft | HandLeft |
| 8 | ShoulderRight | ShoulderRight |
| 9 | ElbowRight | ElbowRight |
| 10 | WristRight | WristRight |
| 11 | HandRight | HandRight |
| 12 | HipLeft | HipLeft |
| 13 | KneeLeft | KneeLeft |
| 14 | AnkleLeft | AnkleLeft |
| 15 | FootLeft | FootLeft |
| 16 | HipRight | HipRight |
| 17 | KneeRight | KneeRight |
| 18 | AnkleRight | AnkleRight |
| 19 | FootRight | FootRight |
| 20 | - | SpineShoulder |
| 21 | - | HandTipLeft |
| 22 | - | ThumbLeft |
| 23 | - | HandTipRight |
| 24 | - | ThumbRight |

Skeleton joint features are insufficient on their own to identify various human actions. Hence, many novel visual representations and machine learning methods have been developed to make full use of skeleton features in human action recognition.

Raptis et al. successfully used skeleton positions for real-time dance classification by applying Principal Component Analysis (PCA) to the torso joint positions. This determines the human torso surface and defines a human pose through the spherical angles between the limb joint positions and the torso surface [20]. A Fourier transform over time was also used to describe the temporal structure of actions.

Yang et al. proposed EigenJoints, a new type of feature based on position differences between joints, to represent actions by combining action information [21]. The position differences are extracted between all pairs of joints within one frame, between the joints of consecutive frames, and between the joints of the current frame and the initial frame, in order to capture the structure of human postures. They then applied PCA to these features to extract the data most important for action recognition, the "eigenjoints", and classified actions with a nearest-neighbor classifier.
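
The three groups of joint position differences described above can be sketched as follows. This is a minimal illustration under stated assumptions: joints are plain (x, y, z) tuples, the names `frame_features`, `f_cc`, `f_cp` and `f_ci` are hypothetical, and the PCA and nearest-neighbor steps of [21] are deliberately left out.

```python
from itertools import combinations, product

def diff(a, b):
    """Component-wise difference between two 3D joint positions."""
    return tuple(ai - bi for ai, bi in zip(a, b))

def frame_features(current, previous, initial):
    """Concatenate the position differences for one frame.

    f_cc: differences between all joint pairs inside the current frame.
    f_cp: differences between current-frame and previous-frame joints.
    f_ci: differences between current-frame and initial-frame joints.
    """
    f_cc = [diff(a, b) for a, b in combinations(current, 2)]
    f_cp = [diff(a, b) for a, b in product(current, previous)]
    f_ci = [diff(a, b) for a, b in product(current, initial)]
    return [value for d in f_cc + f_cp + f_ci for value in d]

# Hypothetical 3-joint skeleton observed over three frames.
initial = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
previous = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.5, 0.0)]
current = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 2.0, 0.0)]
features = frame_features(current, previous, initial)
```

With 3 joints this yields 3 + 9 + 9 difference vectors, that is 63 values per frame; the dimensionality grows quickly with the number of joints, which is one reason a reduction step such as PCA is applied afterwards.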

Xia et al. introduced a novel approach for human action recognition based on histograms of 3D joint locations (HOJ3D). It extracts a histogram of the spherical coordinates of the joint positions in a coordinate system that uses the hip joint as the origin [22]. They also employed the method of Shotton et al. to extract 3D joint locations from a depth image, which uses a local mode-finding approach based on mean shift with a weighted Gaussian kernel to compute confidence-scored 3D position estimates of body joints [23].
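
A histogram of spherical joint coordinates centred on the hip can be sketched as follows. This is only an illustration of the idea: it assumes a y-up coordinate system, uses hard bin assignment, and the function name `hoj3d` and the bin counts are assumptions for the example rather than details confirmed by [22].

```python
import math

def hoj3d(joints, hip, azimuth_bins=12, inclination_bins=7):
    """Histogram of joint directions in spherical coordinates around the hip."""
    hist = [0] * (azimuth_bins * inclination_bins)
    for x, y, z in joints:
        dx, dy, dz = x - hip[0], y - hip[1], z - hip[2]
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        if r == 0.0:
            continue  # a joint at the origin has no direction
        # Azimuth: angle in the horizontal (x, z) plane, in [0, 2*pi).
        azimuth = math.atan2(dz, dx) % (2 * math.pi)
        # Inclination: angle from the vertical y axis, in [0, pi].
        inclination = math.acos(max(-1.0, min(1.0, dy / r)))
        a = min(int(azimuth / (2 * math.pi) * azimuth_bins), azimuth_bins - 1)
        i = min(int(inclination / math.pi * inclination_bins), inclination_bins - 1)
        hist[i * azimuth_bins + a] += 1
    return hist

# Hypothetical skeleton: hip at the origin, one joint above, one to the side.
hip = (0.0, 0.0, 0.0)
joints = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), hip]
hist = hoj3d(joints, hip)
```

Because only directions relative to the hip are counted, the histogram does not depend on where the person stands in the scene.
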

Chaudhry et al. presented bio-inspired dynamic 3D discriminative skeleton features, using linear dynamical systems to model the dynamic medial axis structures of human body parts. The paper considered a discriminative metric for comparing sets of linear dynamical systems for action recognition. Table 3 summarizes relevant papers on Kinect-based physiotherapy and assessment using skeleton joints. Each paper proposed different methods for different rehabilitation exercises.

Table 3. Kinect Skeleton Joint features for Physiotherapy and Assessment

| First author / Year | Diseases & Exercises / motion | Dataset | Method used |
|---|---|---|---|
| J. Venugopalan, 2013 [24] | Traumatic Brain Injury: hand waving sideways; horizontal stretch; vertical stretch; left and right hand extended; front hand stretch | 20 joints (x, y, z); Quaternion (w, x, y, z); 30 frames; 4 subjects (22-30 yo) | Skeleton Normalization; Direct comparison; Cross Correlation; Dynamic Time Warping |
| H. Jiang, 2013 [25] | Stroke with Hemiplegia: legs and arms motion | 20 joints (x, y, z) | Skeleton Normalization; Dynamic Time Warping |
| T. Y. Lin, 2013 [26] | Parkinson's Disease: seated Tai Chi exercises with 18 forms | 10 joints (x, y, z); 2 subjects (60 & 91 yo) | Skeleton Normalization; Kolmogorov-Smirnov test |
| R. Staab, 2014 [33] | Jumping jacks; arm circle; arm curls | 20 joints (x, y, z) | SVM; SVM with sigmoid kernel; K-NN |
| H. D. Rosario, 2014 [45] | Motion of limbs; trunk, shoulder and hips | 12 joints (x, y, z) | NITE algorithm; Particle Filter |
| S. Li, 2014 [46] | Rehabilitation Exercises | 20 joints (x, y, z) | Proposed algorithm; Kalman Filter |
| J. D. Lee, 2014 [27] | Movement Disorder: Tai-Chi exercises | 10 joints; 1 subject (60 yo) | Skeleton Normalization; Fuzzy Logic |
| D. Anton, 2015 [30] | Shoulder disorders: hands to mouth; shoulder extension; shoulder flexion; hands to head | 20 joints (x, y, z); 15 subjects (44-83 yo) | Trajectory recognition; Dynamic Time Warping |
| Q. Wang, 2015 [47] | 12 different exercises including 6 sitting poses | 20 joints and 14 joints | Unscented Kalman Filter; Kinematic Filtering |
| M. Capecci, 2016 [28] | Physical impairment and disabilities: arm lifting; squatting; pelvis rotation; trunk rotation; trunk tilting | 25 joints (x, y, z); 33 subjects (22-72 yo) | Zero Velocity Crossing; Hidden Semi-Markov Model; Dynamic Time Warping |
| S. Sinha, 2017 [48] | Stroke | 20 joints (x, y, z) | Point Cloud Segmentation; Cylinder Model Fitting; Kalman Filter |
| J. Richter, 2017 [31] | Neurological disorders: Active ROM exercises; Hip abduction exercises | 20 joints (x, y, z) | Local and Normalised Hierarchical Coordinates; Incremental DTW; SVM |
| S. H. Han, 2017 [29] | Recovery states: postural correction | 20 joints (x, y, z) | Z-Score Normalization; Deep neural network; Deep learning algorithm |


Venugopalan et al. applied skeleton normalization as a pre-processing step to overcome discrepancies when real-time quantitative assessments of exercises performed at home by traumatic brain injury (TBI) patients are matched against template exercises performed in the clinic [24]. Jiang et al. [25], Lin et al. [26] and Lee et al. [27] likewise employed normalization to compare and evaluate different skeleton models. Capecci et al., in contrast, used Zero-Velocity Crossing (ZVC) to locate the start and end points for human motion segmentation in order to identify moving points [28]. Han et al. scaled the joint point values with Z-score normalization to improve the performance of their deep learning algorithm [29].
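
Two of the pre-processing steps mentioned above, Z-score normalization and zero-velocity crossing, can be sketched on a one-dimensional joint trajectory. This is a minimal illustration with hypothetical function names; the cited systems work on full 3D skeletons, and the crossing test here only detects strict sign changes of the frame-to-frame velocity.

```python
import statistics

def z_score(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def zero_velocity_crossings(positions):
    """Return frame indices where the velocity changes sign.

    These turning points are candidate start and end points of a
    repetition when segmenting a repetitive movement.
    """
    velocities = [b - a for a, b in zip(positions, positions[1:])]
    crossings = []
    for i in range(1, len(velocities)):
        if velocities[i - 1] * velocities[i] < 0:
            crossings.append(i)
    return crossings

# Hypothetical wrist height over time: raise, lower, raise again.
wrist_height = [0, 1, 2, 3, 2, 1, 0, 1, 2]
turning_points = zero_velocity_crossings(wrist_height)
scaled = z_score([2, 4, 4, 4, 5, 5, 7, 9])
```

Here the turning points fall at the peak (frame 3) and the trough (frame 6) of the movement, which splits the sequence into one raising and one lowering phase.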

For exercise assessment, Capecci et al. [28] compared a Hidden Semi-Markov Model (HSMM) based algorithm for monitoring and evaluating rehabilitation exercises with the Dynamic Time Warping (DTW) algorithm. HSMM outperformed DTW, as its scores correlated better with the clinical scores. However, Venugopalan et al. [24] stated that DTW surpasses the direct comparison and cross-correlation methods, as DTW is able to give a large score and achieves higher separation between most of the similar and dissimilar videos. DTW was also used by Jiang et al. [25] and Anton et al. [30].
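
The classic DTW recurrence can be sketched as follows for one-dimensional sequences, for example a single joint angle over time. This is a textbook formulation given as an illustration, not the implementation of any cited system; the function name and the sample sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

template = [0, 1, 2, 1, 0]                  # reference exercise
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]       # same movement at half speed
too_large = [0, 2, 4, 2, 0]                 # different amplitude

same_shape = dtw_distance(template, slow)
different_shape = dtw_distance(template, too_large)
```

The slower repetition aligns perfectly with the template, while the movement with a different amplitude does not, which illustrates why DTW suits comparing exercises performed at different speeds.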

On the other hand, Richter et al. intended to give patients continuous evaluation; hence, they employed an extension of DTW, Incremental DTW [31], introduced by Khan et al. [32], which handles the comparison between the reference exercises and the exercise the patient is currently performing. They then classified the motion using a hierarchical SVM. Staab also used an SVM, as well as an SVM with a sigmoid kernel, to train on several motion exercises, since each exercise has a unique distribution in feature space [33], and concluded that a model that performs well for one exercise might not be suitable for another.

In brief, skeleton joints can provide reliable joint coordinates through the Kinect's real-time skeleton estimation algorithm. The approach has also drawn great attention [33-43], as it is robust to illumination changes, cluttered backgrounds and camera motion.

II. Depth Maps

Depth imaging technology has advanced considerably over the last few years, finally reaching a consumer price point with the launch of Kinect. Depth images provide the depth information of an object, also known as its z-information in the real world. Depth maps can also be collected with stereo cameras, laser triangulation and similar techniques, and they are widely used in many recent 3D vision algorithms. The intensity values in a depth image represent the distance of the object from a viewpoint. As illustrated in Figure 3a, there is a bottle and an umbrella in the scene. The depth image in Figure 3b shows luminance in proportion to the distance of an object from the camera. The bottle, which is nearer to the camera, is darker, while the umbrella, which is farther away, is lighter.
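
The mapping from distance to luminance described above can be sketched as a linear rescaling of depth values to 8-bit grey levels. This is a minimal illustration: the function name is hypothetical, and the 500 mm to 4500 mm working range is an assumed default rather than a value taken from this paper.

```python
def depth_to_gray(depth_mm, min_mm=500, max_mm=4500):
    """Map a depth map in millimetres to grey levels in 0-255.

    Nearer pixels become darker and farther pixels lighter; depths
    outside the assumed working range are clamped to its limits.
    """
    span = max_mm - min_mm
    image = []
    for row in depth_mm:
        gray_row = []
        for d in row:
            clamped = min(max(d, min_mm), max_mm)
            gray_row.append(round(255 * (clamped - min_mm) / span))
        image.append(gray_row)
    return image

# Hypothetical scene: a bottle at 1.0 m and an umbrella at 3.0 m.
gray = depth_to_gray([[1000, 3000]])
```

In this example the nearer bottle receives a lower grey level than the umbrella, matching the darker-is-nearer convention of Figure 3b.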







Figure 3. Example of comparison between normal and Depth image using Kinect v2. 
a) Normal Image; b) Depth Image 


In the last few years, several solutions for activity recognition that extract features from depth data have been presented. One example is [49], which presents an adaptive spatial-temporal pyramid to better retain the spatial and temporal order. Truong et al. developed a simple novel method for hand gesture recognition that achieves accurate results in real time using depth information from the Kinect sensor [50]. The authors applied thresholds to the hand point, tracked by a Bayesian object localization method in the depth image, to determine the hand region. Next, Samad et al. applied background segmentation with an improved adaptive Gaussian mixture algorithm to the depth map to detect moving objects [51].

In [52], the raw depth map was smoothed by applying two filtering methods, pixel filtering and context filtering. The depth map was then encoded with the proposed Local Ternary Direction Pattern (LTDP) feature descriptor and passed to an SVM classifier. LTDP outperformed five existing descriptors (LBP, LTP, HOG, PHOG and CENTRIST), and using LTDP on the depth map reduced the nonlinear effect of the SVM classification task.

Yang et al. proposed HOG on Depth Motion Maps (DMM-HOG), which applies the HOG descriptor to depth motion maps. A depth motion map is computed by taking the difference between the depth maps of two consecutive frames, thresholding the difference, and aggregating the result over time. DMMs are extracted from the front, top and side views [53]. Next, Xia et al. presented a novel human detection method using depth information from Kinect; it can detect persons in all poses and appearances and also provides an exact estimation of the whole body contour of a person [54].
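
The three steps of a depth motion map (difference, threshold, accumulate) can be sketched for a single view as follows. This is a simplified illustration: the function name and the threshold value are assumptions, and the projections onto the top and side views as well as the subsequent HOG step are omitted.

```python
def depth_motion_map(frames, threshold):
    """Accumulate thresholded depth differences over consecutive frames.

    `frames` is a list of equally sized 2D depth maps. Each pixel counts
    how many frame-to-frame changes exceeded `threshold`.
    """
    height, width = len(frames[0]), len(frames[0][0])
    dmm = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > threshold:
                    dmm[y][x] += 1
    return dmm

# Hypothetical 2x2 depth sequence: only the top-right pixel really moves.
frames = [
    [[10, 10], [10, 10]],
    [[10, 50], [10, 10]],
    [[10, 90], [12, 10]],
]
dmm = depth_motion_map(frames, threshold=5)
```

The small change of 2 units in the bottom-left pixel stays below the threshold and is ignored, so only the moving region accumulates energy.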

On the other hand, Ni et al. proposed a method for action recognition that combines depth maps with RGB videos. The method identifies interest points in the RGB videos, extracts HOG/HOF features from the RGB videos and LDP features from the depth sequences, and concatenates the RGB and depth map features [55]. It shows that the information in RGB videos and depth map sequences is complementary. Numerous studies applying depth images to rehabilitation and physiotherapy assessment are listed in Table 4.

Bakar et al. [56] and Sosa et al. [57] employed a Region of Interest (ROI) in the segmentation phase to minimize the area and remove unwanted surrounding objects, while Sinha et al. [58] proposed an algorithm based on depth-based segmentation and PCA to improve the accuracy of Kinect for upper body rehabilitation applications.

Next, Yao et al. [59] proposed a Kinect-based rehabilitation system for both therapists and patients. They evaluated their time-sequence data using cross-correlation, a method well known for detecting common periodicities. They also employed DTW, which finds the optimal alignment between two given time sequences, to check whether the patients performed the exercises at the same rate as the skeleton frame sequences. Furthermore, Ye et al. [60] used DTW to compute a distance matrix for gait pattern extraction, while Su et al. [61] applied DTW to measure the similarity of joint data between "at home exercises" and "in hospital exercises".
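
Cross-correlation between a reference signal and a patient signal can be sketched as follows. This is a generic normalized cross-correlation given as an illustration under stated assumptions; the function names and the sample signals are hypothetical and do not reproduce the evaluation in [59].

```python
import math

def normalized_cross_correlation(x, y, lag):
    """Correlation coefficient between x[t] and y[t + lag] over their overlap."""
    pairs = [(x[t], y[t + lag]) for t in range(len(x)) if 0 <= t + lag < len(y)]
    if len(pairs) < 2:
        return 0.0
    mean_x = sum(a for a, _ in pairs) / len(pairs)
    mean_y = sum(b for _, b in pairs) / len(pairs)
    num = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    den = math.sqrt(sum((a - mean_x) ** 2 for a, _ in pairs)
                    * sum((b - mean_y) ** 2 for _, b in pairs))
    return num / den if den else 0.0

def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] at which y best matches x."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: normalized_cross_correlation(x, y, lag))

# Hypothetical periodic joint signal and a copy delayed by one frame.
reference = [0, 1, 0, -1] * 3
patient = reference[-1:] + reference[:-1]
delay = best_lag(reference, patient, max_lag=2)
peak = normalized_cross_correlation(reference, patient, delay)
```

The peak of the correlation identifies the delay between the two periodic movements; unlike DTW, a single lag cannot compensate for speed changes within a repetition.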

Su et al. then evaluated performance using an Adaptive Neuro-Fuzzy Inference System (ANFIS), which integrates a neural network and fuzzy logic [61]. Nomm et al. used a neural-network-based ANARX (Additive Nonlinear Auto Regressive exogenous) model in their monitoring system, as it can adjust the system to the specific needs of each patient [62], whereas Ye et al. used a neural-network-based nonlinear autoregressive with exogenous input (NARX) model for gait phase classification and an Enhanced Random Decision Forest (ERF) for cases with missing features [60]. Next, Nahavandi et al. trained a Random Decision Forest (RDF) to generalise a learning model that discriminates between seven RULA-scored sets of postures [63].

In contrast, Collins et al. recognized several human actions performed by stroke patients with high certainty by employing HON4D as a global descriptor [64]. Nghia et al. proposed an algorithm to compute a discriminative feature, the depth of the wrist, by building a mapping table between the differences of bone joint depth and head depth provided by Kinect [65]. Overall, depth maps are a good approach for detecting human action because they are insensitive to changes in lighting conditions and hence work even in low ambient light [66]. The properties of depth maps also make them convenient to use with specific feature descriptors, which boosts their evaluation performance [48], [65].


2.2 Recognition and Classification 

Recognizing human activities in video sequences or images can be quite challenging because of scene complexities [68] such as background clutter, partial occlusion, and changes in scale and lighting. Therefore, classification is needed to label and recognize human actions and thus solve the recognition and localization problem.

Hidden Markov Models (HMMs) are known to have high classification rates and are favoured for classifying dynamic gestures. The works in [67]-[71] all use HMMs in a straightforward manner for gesture classification. In contrast, [74] employed a Riemannian shape space to represent the distance between curves and used k-Nearest Neighbour (k-NN) as the classifier. k-NN classifiers are also prominent for static poses because of their high classification rates and because they are simple to implement.
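
A k-NN classifier for static poses can be sketched in a few lines. This is a generic illustration: the two-dimensional feature vectors, the pose labels and the function name are hypothetical, and plain Euclidean distance is used instead of the shape-space distance of [74].

```python
import math
from collections import Counter

def knn_classify(sample, training, k=3):
    """Label a feature vector by majority vote among its k nearest neighbours.

    `training` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical pose features, e.g. two normalized joint angles per pose.
training = [
    ((0.0, 0.0), "sit"), ((0.1, 0.2), "sit"), ((0.2, 0.1), "sit"),
    ((1.0, 1.0), "stand"), ((0.9, 1.1), "stand"), ((1.1, 0.9), "stand"),
]
pose_a = knn_classify((0.15, 0.1), training)
pose_b = knn_classify((0.95, 1.0), training)
```

No training phase is needed beyond storing labelled examples, which is part of why k-NN is simple to implement; the cost is that every query compares against all stored samples.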

In [23], Randomized Decision Forests, where a forest is an ensemble of trees, are used as the classifier. They have proven to be fast and effective classifiers for many tasks [73-75] and can be implemented efficiently on the GPU [78]. Random forests were also used in [77], [78] for action recognition.

Neural Networks (NN) and Support Vector Machines (SVM) are also commonly used for pose and motion recognition. Suriani [81] employed an SVM to identify whether a person's movement is normal or anomalous, for fall detection during home-based rehabilitation exercises. Ciresan et al. used Convolutional Neural Networks (CNN) [82], while Toshev et al. implemented Deep Neural Networks (DNN) to recognize human poses and activities [83]. Du et al. divided the human skeleton into five segments and used each part to train a hierarchical recurrent neural network [36].

Table 4. Kinect Depth Map Features for Physiotherapy and Assessment








First author / Year Diseases & Exercises / motion Dataset Method used 
Y. J. Chang, 2013. [84] Cerebral Palsy -Depth Information -Kolmogorov-Smirnov 
Upper limb rehabilitation Test 
B.Penelle, 2013. [85] Lower limb -Depth Images - GPU based 
Whole Body -Particle Filter 


S. Nomm, 2013. [62] 


T. Watanabe, 2014. [86] 


L. Yao, 2014. [59] 
C. J. Su, 2014. [61] 


M.Z. A. Bakar, 2015. [56] 


L. Omelina, 2016 [87] 


G. D. Sosa, 2015. [57] 


S. Sinha, 2016. [58] 


M. Ye, 2017. [60] 


D. Nahavandi, 2017. [63] 


J. Collins, 2017. [64] 


V. T. T. T. Nghia, 2017. 
[65] 


R. Samad, 2018. 


Motor Functions 
Therapeutic exercises for human 
limbs 

-Dynamic ROM 

Elderly 

Lower Limb Chair Exercise 
-toe lift 

-heel lift 

-Knee extensions 

-thigh Lifts 

-open-leg exercises 
-Rehabilitation Exercises 
Post-injuries 

-shoulder rehab exercises 


Wrist Hand Injury 
Hand Deviation 


Face recognition 


Multiple Sclerosis 
-shoulder elevation 
-shoulder abduction 
-hip abduction 
Upper body 

-ROM exercise 


-Stroke 
-walking exercises 


Musculoskeletal Disorder 
-Shoulder 
-Elbow 

-Trunk 

-Neck 

Stroke 

- arm extensions 
-chest sway 
-walking
-Flexion 
-Elevation 
-Abduction 


-Depth information 


-Depth Images 
-7 subjects (75 — 85 yo) 


-Depth Images 
-depth information 


-3D data 
-Depth Data 
-RGB Data 
-Depth Images 


-RGB-D 
2D with 30fps 
-4 subjects (24 — 39 yo) 


-RGB Depth 
-10 subjects (21- 55yo) 


-depth map 


-depth images 
-Rapid Upper Limb 
Assessment Scoring (RULA) 


-depth data 


-skeleton joint 
-depth images 


-Neural Networks 
-NN-based ANARX 


-Average Recognition 
Rate 


-DTW 

-DTW 

-Neural Network 

-Fuzzy Logic 

-Region of Interest (ROI)
-Hand Contour 

- K-Curvature 

-Local Binary Patterns 
with Chi Square 
-K-Mean 

-Region of Interest (ROI)
-background subtraction 
-human silhouette 
-skeletonization 
-Principal Component 
Analysis (PCA) 
-Proposed algorithm 
-DTW 

-NN based algorithm 
(NARX) 

-Kernel Filter 

-Enhanced Random 
Decision Forest 

-DCF Feature Extraction 
-Random Decision Forest 


-Histogram of Oriented 4D 
Normal (HON4D) 


-proposed algorithm 





3. LIMITATIONS AND FUTURE WORKS 

This paper has reviewed several methods for Kinect-based physiotherapy and assessment that can be used to develop technology and innovative services for assisted living applications. However, the methods discussed have limitations and drawbacks.

For skeleton joints, detection may become less accurate and robust when joints are not labelled properly, as this leads to confusion between points, especially between the ankle and knee points, and thus results in false detections. Furthermore, determining inner joints such as the elbow and knee, which may be hidden from view, also produces unsatisfactory results [88].

Another constraint is high interclass similarity, where similar joints are detected for different activities. For example, in the CAD-60 dataset, the drinking water and brushing teeth activities involve similar joints and were classified erroneously. A further challenge is high intraclass variability: the same action may be performed in different ways by the same subject, for example with the left hand, the right hand, or both hands interchangeably.

For depth maps, occlusion is the problem most often encountered by researchers. The human image is sometimes occluded by other objects or persons, which degrades detection performance [89]. Using only one depth camera also increases the persistence of background or foreground occlusions. Another drawback of depth maps is cluttered backgrounds, which cause poor detection because the subject is not clearly visible and is difficult to detect. In addition, as the distance to the camera increases, depth maps become noisier and less accurate.

Hence, in the future, we intend to develop a new algorithm or adapt existing algorithms in order to minimize these limitations. We will also consider more suitable models and combine them to further enhance detection performance under our proposed framework.


4. CONCLUSION 

Human activity understanding has attracted widespread interest from many researchers and has become one of the most active research topics in computer vision. We have provided a comprehensive analysis of existing, publicly available work on Kinect-based physiotherapy and assessment, a field in which Kinect has recently found various applications. We have described the types of exercises and the limitations of exergames and serious games for rehabilitation. There is ample scope for future work on improving Kinect assessment methods so that they accurately assess a patient completing rehabilitation exercises and produce data that can be used clinically.